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Estimating the temperature of a cold quantum system is difficult. Usually, one measures a well- 
understood thermal state and uses that prior knowledge to infer its temperature. In contrast, we 
introduce a method of thermometry that assumes minimal knowledge of the state of a system and 
is potentially non-destructive. Our method uses a universal temperature-dependence of the quench 
dynamics of an initially thermal system coupled to a qubit probe that follows from the Tasaki-Crooks 
theorem for non-equilibrium work distributions. We provide examples for a cold-atom system, in 
which our thermometry protocol may retain accuracy and precision at subnanokelvin temperatures. 

PACS numbers: 06.20.-f, 05.70.Ln, 03.65.Yz, 67.85.Pq 


I. INTRODUCTION 

Many technological applications utilizing quantum sys¬ 
tems, e.g. analogue quantum simulators fl], require pre¬ 
cise and accurate measurements of their temperature, 
making thermometry of quantum systems a fundamental 
task. Conventional thermometry proceeds by bringing a 
small probe of known temperature-dependence into equi¬ 
librium with a thermal system and then measuring that 
probe. For accurate thermometry, this requires that the 
characteristic energy of the probe is precisely known and 
tuned near the thermal energy pj . This can be challeng¬ 
ing at the low temperatures relevant for experiments on 
ultracold atoms. 

Instead, the temperature of ultracold gases is of¬ 
ten inferred by directly measuring observables whose 
temperature-dependence is well understood. For in¬ 
stance, time-of-flight imaging of the momentum distribu¬ 
tion is used to obtain the temperature of weakly interact¬ 
ing cold atoms HE!. Meanwhile, for strongly interacting 
atoms in a lattice, temperature is inferred from fluctu¬ 
ations of lattice-site occupation numbers [5j- However, 
such an approach is only feasible if the system Hamilto¬ 
nian or its thermal states are well characterized and suffi¬ 
ciently simple, so that the temperature-dependence of ob¬ 
servables can be calculated and compared with measure¬ 
ments. Unfortunately, these requirements are frequently 
unmet in cold-atom experiments, where the system can 
be strongly correlated and established perturbative or 
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numerical techniques typically fail. What is missing is 
a generic approach for thermometry of cold atoms that 
does not need prior understanding of the thermal state. 

To fill this gap we return to the idea of bringing probes 
into contact with an initially thermal quantum system, 
this time focusing on the ensuing non-equilibrium dynam¬ 
ics. This increasingly studied HU and potentially non¬ 
destructive approach to investigating quantum systems 
has been used to analyze the parameters m and spec¬ 
trum m of a Hamiltonian, and the non-Markovianity of 
an open quantum system Applications of this ap¬ 
proach to the thermometry of cold atoms EMU! show 
that focusing on non-equilibrium dynamics can avoid the 
requirement of precisely tuning the characteristic energy 
of the probe near the thermal energy. However, the par¬ 
ticular approaches put forward so far rely on the system 
having a well-understood Hamiltonian. 

In this article, we show how to use a non-equilibrium 
probe to infer the temperature of a cold-atom system, 
which may in principle have an arbitrary Hamiltonian. 
Our approach exploits the Tasaki-Crooks theorem P3h 
120] : a universal temperature-dependent relationship be¬ 
tween non-equilibrium work distributions that may be 
embedded in the state of a qubit probe [2TH23] . We 
demonstrate that this versatile method is naturally suited 
to thermometry of cold atomic gases, and is both accu¬ 
rate and robust in the presence of imperfect data. Impor¬ 
tantly, no detailed knowledge of the internal dynamics of 
the system is needed. The only requirement is control 
over the coupling between the system and the two states 
of the qubit thermometer, which is readily achieved by 
using an atomic impurity as the probe. Our protocol 
thus realizes near-ideal thermometry within its domain 
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of applicability, which corresponds to temperatures com¬ 
mensurate with or lower than the characteristic energy 
scales of the system. This is precisely the temperature 
regime of greatest interest for cold-atom physics, and also 
a challenging one for conventional thermometry. 

We gauge the accuracy and precision of our protocol 
by simulating its application to a paradigmatic ultracold- 
atom system. Specifically, we consider a Bose-Hubbard 
model (BHM) and localized impurity qubit, as could be 
realized by cold bosons in an optical lattice and, for ex¬ 
ample, two internal states of a separately-trapped atom 
of a different species . Our protocol maintains ac¬ 

curacy and precision to a few percent in all regimes inves¬ 
tigated. This includes when the thermal energy is one or 
two orders of magnitudes lower than the hopping energy 
of the BHM, where the latter might typically correspond 
to tens of nanokelvin. Moreover, it includes intermediate 
interaction strengths for which the nature of the ther¬ 
mal state is poorly understood and neither time-of-flight 
nor number-fluctuation measurements reveal the temper¬ 
ature. This work thus opens the door for thermometry of 
generic cold-atom systems at extreme temperatures and 
the technologies, e.g. quantum simulation, that require 
such thermometry. We begin with a general description 
of our scheme, before proceeding to a detailed study of 
its application to the BHM. 

II. DESCRIPTION OF THE PROTOCOL 


B. Qubit interferometry 


To directly measure quantum work distributions no¬ 
li, and thus their ratio R(W ), requires overcoming sig¬ 
nificant challenges, namely the realization of often pro¬ 
hibitively large numbers of difficult projective energy 
measurements [ 51 ] 132 ! • We instead consider an indi¬ 
rect approach to measuring R[W) using qubit interfer¬ 
ometry [2TH23] . The system of interest is brought into 
contact with a probe qubit, giving total Hamiltonian 
Hr(t) = —(A/2 )a z + Hs + Hi(t). Here A is the dif¬ 
ference in energy between the ground and excited states 
| j,) and | f) of the qubit, Hs is the Hamiltonian of the 
system of interest, and the interaction Hi(t) takes the 
form 

Hi{t) = (< 4 (t)imi +0r(*)ltxtl) ® V. (2) 

The combined system is initialized at time t = 0 in 
the state p = |s)(s| (g> /3 ( s(Aq( 0)), with the qubit in 
some superposition |s) = S|| j.) + s-|-| f)- This could 
be achieved by first reaching equilibrium with (3A 3> 1 
and </_!_(0) = g-|-(0) = Aq( 0), then applying a rotation 
<7 S = s^a x +s±a~ to the qubit. The state-dependent cou¬ 
plings g\,(t) and are then both varied according to 
the quench A Q(t) to be investigated, but with the latter 
delayed by a time u, i.e. 


A. Non-equilibrium work distributions 

Our thermometry protocol is based on a relationship 
between distributions of the work done by quenching a 
system away from equilibrium. We write Pq(W) for 
the distribution of the work W done on a system, e.g. 
a cold atomic gas, due to a quench Q. In the quench, 
the parameter A appearing in a system’s Hamiltonian 
H( A) = Hs + XV is varied as Ag(f) for t £ [0, r] 
driving the system away from an initial thermal state 
pp (Aq( 0)). Here /3 is the inverse temperature, pp( A) = 
e-OHM/Zpi A) is a thermal state of the system and 
Zp( A) = tr{exp[— (3H(A)}} is the corresponding partition 
function. 

The forward distribution Pp(W) for some quench 
A p(t) from A, to A/ is related to the backward distri¬ 
bution Pb(W) of its reverse As(f) = A f(t — t) from A / 
to A i via the Tasaki-Crooks relation [IMS 

In {R{W)} = In | ] = P{W - A F). (1) 

The ratio R(W) of the work distributions therefore de¬ 
pends only on [3 and one other constant, the free en¬ 
ergy difference A F = F(Af) — F(Aj), with F( A) = 
/3 _1 In [Zp{A)\. Note that this relation holds generally 
for a coherent quench; it is not based on assumptions of 
linear response or adiabaticity. 


9i{t) = | 
StO) = | 


Aq (t) 1 
Aq(t), 

Aq( 0), 
A Q (t — u), 


0 < t < r, 

T < t < T + U, 

0 < t < U, 

U < t < T + u. 


At time r + it, when both quenches are complete, the 
qubit has a reduced density operator, in units where H = 
1, 


p q =N 2 |'I')4I + 4 s 4-e iA(r+u) XQ(w)|IXt| 

+ s | St e- iA ^) X Q(«)|t)4l + ktl 2 lt)(t|. 

Here we have introduced the dephasing function 
X q(u) = tr {ule3 u ^V))u Qe -^ Q m M Aq ( 0 ))} , 

where Uq = Texp[—i dtH(Ag(t))] evolves the sys¬ 
tem according to the time-dependent quench Hamilto¬ 
nian H(AQ(t)) and T is the time-ordering operator. 

Close examination reveals that Xq{ u ) is none other 
than the characteristic function, or Fourier transform, of 
the work distribution mm 

P Q (W) - (2 n)- 1 ! d ue~ iWu x Q (u). 

Hence it is possible to measure Pq(W) from X q(w), and 
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the latter from expected values 

(6- x )+i (fry) =tr q {(a x +ia y )p q } = 2S|S t e _lA(T+ ” ) XQ(zi), 

for the qubit state [Eq. ([3])] at the end of the interference 
protocol. In what follows, we set sTs-j- = 1/2 and A = 0, 
with the more general case treated in the Supplemental 
Material. 


C. Thermometry 

The main result of this article is that we are able to use 
the above relations to systematically, precisely and ac¬ 
curately infer temperature from realistically noisy data, 
without any knowledge regarding the thermal states 
pp(\) of the quantum system. One needs only to have 
good control over a single qubit and its interaction with 
the system, which can be achieved by using an atomic 
impurity as the probe. 

Let us analyze these claims in order. Our protocol 
is robust in the presence of two fundamental sources 
of error, analyzed in detail in the Supplemental Mate¬ 
rial. First, it is only possible to estimate xq( u ) f° r a 
finite number of times Uj. Here we assume N steps times 
Uj = jT/N steps for j = 1,iVsteps- Second, each es¬ 
timate of (a x ) and (a y ) used to estimate Xq( u j) will 
have some error. Here we assume that errors are due 
to the finite number 7V meas of measurements used to es¬ 
timate each expectation value. However, other known 
qubit measurement errors can be treated in the same 
framework. 

The first point means that rather than estimating 
Pq (W), we instead estimate pq(W), a copy that is sub¬ 
ject to spectral leakage, due to the finite time-window 
T, and aliasing, due to the discreteness of Uj. We show 
that provided typically modest values of T and N steps 
are chosen such that tt/3/T is on the order of unity and 
T j Nsteps < TQdeph = 1 /<jq, then any effect on the ratio 
is negligible i.e. R(W) « pp(W)/ pb(— W) (see Supple¬ 
mental Material). Here TQ<j e P h is the delay u needed for 
the qubit to significantly dephase and is the inverse of 
<7q, the width of the work distribution Pq(W). 

The second point means our estimate of XQi u j ) will 
be an unbiased Gaussian random variable with variance 
(2 — \xQ(‘Uj)\ 2 )/N mea ,s, which then propagates linearly 
into an unbiased Gaussian estimate of pq(W) with vari¬ 
ance scaling as T 2 /N steps N meas . In a Bayesian approach 
detailed in the Supplemental Material, we show how this 
knowledge, together with the Tasaki-Crooks relation and 
the non-negativity of the work distributions, can be used 
to build the probability distribution V{0) for 0 given a 
set of estimates of XQ( u j)- It is a lso possible to include 
any prior knowledge of /3 and A F, though here we assume 
no prior knowledge. Our approach is found to be well cal¬ 
ibrated and accurate, to a few percent, given a modest 
number of times N steps and measurements N meas . 

The universality of the Tasaki-Crooks temperature- 


dependence allows the thermometry protocol to be ap¬ 
plied in complete ignorance of the quantum system’s 
thermal state /3^(A) and even its Hamiltonian Hg. It 
only needs to be ensured that both states of the impu¬ 
rity couple to the same operator V [Eq. ([ 2 ])], that the 
coupling strengths g\,{t) and gf(t) trace the same path 
with some delay, and that the backwards path mirrors 
the forward path. Such properties may be understood 
theoretically in advance or confirmed experimentally by 
observing how the qubit, in eigenstate | j.) or | j"), be¬ 
haves when interacting with the system. The choices 
of perturbing Hamiltonian V and quench Aq^) used for 
thermometry are arbitrary in principle, since they affect 
A F not 0. In fact, provided the relationships above hold, 
the actual values of V and Aq(£) need not be known. 

In practice, any thermometer benefits from optimiza¬ 
tion, and our protocol is no exception. In particular, the 
quench should be chosen so that the fluctuations of the 
non-equilibrium work are on the order of the temperature 
or larger. This ensures that the ratio of work distribu¬ 
tions, R(W) in Eq. 0 can be accurately and precisely 
inferred over a large enough range of values of the work 
W to extract a good straight-line fit for 0. Faced with an 
unknown system, the experimenter may therefore need 
to make some adjustments in order to find an appropri¬ 
ate quench. We emphasise that this optimization can 
be based purely on observations of the qubit evolution, 
without resorting to direct measurements on the system 
of interest. Nevertheless, it is clearly important to use a 
probe qubit whose interaction with the system is highly 
controllable and well understood over a range of energies. 

In the context of cold atomic gases, a probe qubit com¬ 
prising an atomic impurity satisfies these requirements 
well. In the following, we consider density-density cou¬ 
pling, as appropriate for an impurity interacting with a 
host gas of a different species. At low temperatures, this 
interaction is characterized by a small number of param¬ 
eters, such as s-wave scattering lengths, which may be 
accurately measured via independent experiments and 
controlled by means of external fields. Furthermore, 
the range of interaction energies between the impurity 
atom and the atomic gas naturally coincides with the 
characteristic energies of interaction between the gas 
atoms themselves. It is therefore apparent that the non¬ 
equilibrium work fluctuations induced by an atomic im¬ 
purity are well suited to characterize temperatures that 
are on the same order or smaller than the natural energy 
scales of the cold-atom system. 


III. EXAMPLE WITH COLD ATOMS IN A 
LATTICE 

For concreteness, from here on we take our system to 
be a cold atomic gas confined by an optical lattice, and 
the qubit to be formed by two internal states of an im¬ 
purity atom of a different species, which a strong trap 
localizes to density n q ( r) fMI - fTT j. Both impurity states 
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| 4-) and | t) couple via the contact interaction to the 
weighted density operator V = f drn g (r)'F'i(r)'I'(r) with 
different interaction strengths and g-f. Here 4/i(r) and 
4' (r) are the field operators for the atoms comprising the 
system. Thus it is possible to realize a combined Hamil¬ 
tonian Ht of the form required for our protocol. The 
qubit gate <r s , and the measurement of a x and a y can be 
performed e.g. using Rabi pulses combined with state- 
dependent fluorescing. 

The separate time-dependence of the couplings 
strengths g\,(t) and g-[{t) could be controlled by Fes- 
chbach resonances [24ft27j . Another option is to realize 
the time-dependence of the coupling strengths effectively 
by changing the properties of their trap, which may be 
state-selective. A further possibility is to forgo using in¬ 
ternal states to form the qubit, instead splitting the wave- 
function of an impurity, passing two copies through the 
system, and then interfering them [33j . 

For our specific example, we consider a one¬ 
dimensional Bose gas in a simple periodic lattice, which 
reduces to the Bose-Hubbard model {Ml 

m ' / jj \ 

h s =-jy :w 

in') 3 =i V J 

Here and a,j create and annihilate a particle at site 
j of a total of M, the hopping and interaction energies 
and chemical potential are written J, U and /i, respec¬ 
tively, and {jj') represents a sum over nearest neighbors. 
Assuming the impurity to be localized at a single central 
site c, the system’s interaction Hamiltonian is V = rja\a C) 
where ?y = J drn g (r)|'u> c (r)| 2 and w c (r) is the Wannier 
function at the central site. 


A. Superfluid phase 

In the superfluid regime nU/J <C 1, with n the num¬ 
ber of bosons per site, we can describe the system ap¬ 
proximately in terms of phononic Bogoliubov excitations 
above the uniform condensate of density n. To second or¬ 
der in these excitations and terms that create them, the 
system and interaction Hamiltonians simplify \‘Jo , up to 
a constant, to 

Hs 

k 

V =rjn + ^2{rf k b\ + r] k b k ). 
k 

Here, Sj) and b k create and annihilate a phonon at quasi¬ 
momentum k, Wfc = yf e k (e k + 2 Un) is the phonon dis¬ 
persion written in terms of single-particle energies e k = 
2J(1 — cos ka), and rj k = p^ne k /Mui k e~ lkac is the rela¬ 
tive coupling of each phonon mode to the impurity, with 
a the lattice parameter. 



p j pj 


FIG. 1: Superfluid phase, (a) Characteristic function Xf{u) 
of the forward quench, (b) Work distributions Pq{W). ( c ) 
The joint distribution V(f3,AF) inferred from the estimates 
of pq{W ), together with the corresponding marginal distri¬ 
butions V(/3) and V(AF). (d) The fractional standard devi¬ 
ation <7/3//I and bias [gp — /3)//5 of the /3 estimates observed 
over 1000 simulated experiments, plotted against the actual 
P. The values of T are shown in the figure. Unless stated oth¬ 
erwise, for all figures, the system parameters are M = 1000, 
U/J = 0.1 and n = 1, the quench parameters are A ig/J = 0, 
Xpg/J = 0.5, and rj = 1, and the protocol parameters are 
Nsteps = 500, IVmeas = 500, and TJ = 2.97T. 


With the Hamiltonians Hs and V in this form, 
the characteristic function xq( u ) can be calculated ex¬ 
actly (see Supplemental Material) from time-integrals of 
the form A Q k = co k dt\Q(t) exp _1 “ fet . The specific 
quench considered here is Aq(£) = Aq(0) + (Aq(t) — 
Aq( 0)) sin 2 (7rf/2r). 

We demonstrate the protocol first for a temperature 
corresponding to the typical energy scale of the system, 
fij = 1, with the results shown in Fig. [l] In Fig. [l|a) 
we have plotted the known characteristic function val¬ 
ues XF(uj) for the forward quench. Also shown are the 
estimated values and associated errors from a single sim¬ 
ulated experiment, consisting of 27V meas measurements 
at N g te ps times. In Fig. Bo we show the correspond¬ 
ing work distributions, both forward pf{W) and back¬ 
ward pb(W ), obtained from exact and estimated values 
of Xf{uj) and Xb{uj), again with error bars. Figurejljc) 
shows the joint distribution V{/3, A F) of /3 and A F con¬ 
ditioned upon the estimates of pq(W) obtained. From 
this, the marginal distribution V(/3) for /3, also shown, is 
calculated. The distribution of this example is consistent 
with the known value, containing uncertainty in f) of only 
a few percent. This can be reduced by increasing N steps 
Cl' Aeneas- 
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FIG. 2: Stronger interactions. The contents and parame¬ 
ters of this figure are identical to Fig. 0b) and (c), but for 
a smaller M = 11 system, featuring stronger interactions 
U/J = 4 and A fp/J = 2, and a shorter rj = 0.1 quench 
and longer window-size TJ = 15. 


The Bayesian prediction is remarkably well calibrated. 
For each of several values of /? between 1 and 50, we have 
simulated 1000 experiments of the above type. In Fig. 
0d) we plot the average mean and standard deviation 
of the inferred distribution V(/3), with the average taken 
over the different experiments. This shows that consis¬ 
tent accuracy of a few percent can be obtained even as /? 
is reduced over two orders of magnitude. We also count 
the fraction of times in which the true (5 value lies in each 
decile of the Bayesian prediction. Ideally this would be 
exactly 0.1 for each decile and our predictions conform to 
this, staying between 0.8 and 1.2. Plots of these values 
are given in the Supplemental Material. 


B. Stronger interactions 

We now study the success of the protocol when the 
bosons are more strongly interacting. In this case the 
Bogoliubov approach is not valid, and instead our analy¬ 
sis proceeds using time-evolving block decimation [36U38] 
to evolve a matrix product operator l.iOlU 1 1 representa¬ 
tion of the state of the bosons. Using this we near-exactly 
calculate the characteristic function \q(. u ) f° r the exact 
Bose-Hubbard model [Eq. 0] and interaction Hamilto¬ 
nian. See the Supplemental Material for details on this 
tensor network method [i2] . 

The results for strongly-interacting bosons, close to the 
critical point, are shown in Fig. [2] Compared to super- 
fluid bosons, we see that, despite stronger impurity-boson 
coupling, the qubit dephasing takes place over a longer 
timescale due to the absence of a broad spectrum of low- 
energy excitations. Correspondingly the work distribu¬ 
tions pq{W), shown in Fig. 0a), are more featured than 
for the superfluid case, witn positive skewness resulting 
from a high-frequency shoulder. However, as shown in 
Fig- 0b), the accuracy of the thermometry procedure 
remains largely indifferent to these changes. 


IV. DISCUSSION 

We have shown that using a non-equilibrium probe 
overcomes two challenges in the thermometry of ultracold 
gases. First, the need to precisely control the internal en¬ 
ergy of the probe on scales corresponding to the thermal 
energy of the system. Second, the need to understand the 
temperature-dependence of the system’s thermal state in 
advance. To demonstrate this, we showed that the tem¬ 
perature of bosons in a lattice could be estimated using 
the protocol, for temperatures in the nanokelvins or even 
lower. We also found that accuracy and precision are 
largely unaffected by moving close to a critical point, in 
this case the crossover between superfluid and insulating 
phases, a regime that is less well understood. 

These advantages come at a cost, namely the need 
for exquisite control over the interaction between the 
probe qubit and the cold-atom system. Furthermore, it 
must be possible to perform a quench such that the non¬ 
equilibrium work fluctuations are comparable to the ther¬ 
mal energy of the system. We have argued that atomic 
impurity probes can be expected to satisfy the aforemen¬ 
tioned requirements quite generally, and are therefore ex¬ 
cellent candidates for generic thermometry of cold atoms 
at very low temperatures. 

Instead of performing repeated measurements on a sin¬ 
gle qubit probe, multiple measurements could be per¬ 
formed simultaneously using multiple probes. These 
could be prepared in an array using a lattice and used 
either to reduce the number of measurements per probe 
or to probe a spatially varying temperature profile result¬ 
ing from e.g. heat currents during non-equilbrium trans¬ 
port [431 SI] • Alternatively, if the qubit probe is imple¬ 
mented by interfering atoms passing through the system 
at shifted times, then a steady stream of atom probes 
could near-continuously monitor the temperature of the 
system. 

An essential assumption underlying our thermometry 
protocol is that the system is in thermal equilibrium. 
However, the protocol could potentially be used to assess 
whether this is the case. The protocol is used to find the 
most likely pairs /? and A F given that the Tasaki-Crooks 
relation holds, but it could also evaluate the likelihood 
that the relation is satisfied for any /? and A F, thus al¬ 
lowing the testing of thermalization [35] , 
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SUPPLEMENTAL MATERIAL 


VI. ESTIMATING THE WORK DISTRIBUTION 


This Supplemental Material contains some additional 
details relating to the calculations presented in the main 
text and is organized in the following way. We briefly in¬ 
troduce the relationship between the characteristic func¬ 
tion and moments of the work distribution in Sec. El 
Then, in Sec. ED we give details regarding how estimates 
of the work distributions are obtained from estimates of 
the characteristic function, together with properties of 
these estimates. Section IVIII concerns how these esti¬ 
mates are used to infer the temperature, in both a fre¬ 
quent ist and Bayesian approach. Then finally in Sec. 


VIII we explain how the characteristic function is calcu¬ 


lated for the specific case of the Bose-Hubbard model, 
using Bogoliubov theory and tensor network theory. 


V. PROPERTIES OF THE WORK 
DISTRIBUTION 


In what follows we will refer to a few generic prop¬ 
erties of the work distribution Pq(W), namely its cu- 
mulants n m and principally its first k.qi = pq and sec¬ 
ond kq 2 = Oq cumulants. The cumulants KQ m of the 
work distribution Pq(W) are related to the derivatives 
of the logarithm of the corresponding characteristic func¬ 
tion xq{ u )i according to 


1 d m 

KQm = W d^ l0gX ° (u) 


u—0 


(5) 


We note two properties of the work distribution 
Pq(W ), one relevant to each of /.iq and Oq. First, the sec¬ 
ond law of thermodynamics expressed in terms of mean 
work and free energy ensures that \ip > A F < — ps, 
where A F = F(Xf) — F(Xi) is the free energy differ¬ 
ence. Second, though this is just a re-expression of 
Eq. § for m = 2, the second cumulant is related to 
the so-called dephasing time TQ deph by oq = l/rQ dep h, 
where TQ dep h is here defined by the short-time behavior 
of the dephasing function |xq(m)| = exp(—T q(u)), with 
r Q( u ) = (M/ T Qde P h) 2 /2 + 0{u 3 ). This gives us a way of 
assessing the size of the second cumulant Oq from observ¬ 
ing a small part of Xq{ u ) directly. We assume that this 
is done, at a small overhead to our thermometry protocol 
discussed in the next section, and thus oq and TQ dep h are 
quantities that are, at least approximately, known when 
implementing the protocol. 


A. Deterministic errors 

Here we discuss how to estimate the work distribution 

P Q (W) = F{ X q(u)}(W) = ±-j due- iWu XQ(n), 

from estimates of the characteristic function xq( u ) ob¬ 
tained in a (numerical or actual) experiment, where T 
denotes a Fourier transform. 

It is infeasible to study the characteristic function 
Xq{u ) continuously over all times. More realistically the 
characteristic function Xq( u j) is studied at a finite num¬ 
ber N ste ps of discrete times Uj = jAu for integer j = 
1, ..., N s te ps in some domain [— T, T] with T = N steps Au. 
Noting that xq( 0) = 1 and XQ^j) = X*Q{~ u i)i we then, 
instead of Pq(W), construct a discrete and finite-time 
Fourier transform 

a -^steps 

Pq(W) =-^ Y e~ lWui xq {uj ) w ( u j ) 

j— N steps 

! -Wsteps 

ED e~ lWuj xq (uj)w(uj ) 

where we have introduced a windowing function w(u). 

As we will now discuss, pq{W) differs significantly 
from Pq(W). However, the forward and backward dis¬ 
tributions only enter into our thermometry protocol 
through their ratio. We show below that the ratios 
Pf(W)/pb{—W) and R(W) = Pf{W) / Pb(—W) can be 
made identical, meaning we are able to work with the al¬ 
ternative distributions pq(W) rather than Pq(W). For 
the rest of this subsection, we consider how to choose T 
and N s t e ps for our protocol such that this is the case. 

The two Fourier transforms pq(W) and Pq(W) are 
related, using the Poisson summation formula and the 
convolution theorem, by 




Pq{W) = If{w{u)}{W)* Y, PQ(W + kAW) (WO, 

v k— —oo J 

where AW = 2ir/Au and * represents a convolution. 
This demonstrates the two sources of error in going from 
Pq(W) to pq{W). First, aliasing, where frequencies dif¬ 
fering by AW cannot be distinguished by looking at a 
function at a discretized set of points. Second, spectral 
leakage, where contributions from one frequency leak to 
those nearby on a scale ir/T due to the finite resolution 
offered by the window of finite size T. The former leads 
to the sum and the latter to the convolution. 

For the effects of aliasing to be small, we must have 
that oq/AW <C 1 or oqAu <C 27 t so that the widths 
oq of the work distributions is much smaller than the 
periodicity of its approximation pq(W). Another way to 





write this, in terms of the dephasing time TQdeph = 1 /ctq 
is Aw/TQ d eph < 27r or N steps > T/rQdeph• In all exam¬ 
ples used in our work, the number of time-steps N steps 
is easily large enough to ensure the effects of aliasing are 
negligible. 

The effects of spectral leakage on pq(W) will always be 
significant for a finite system with a discrete set of energy 
levels. Importantly for our protocol, the same is not true 
for their ratios, as we now show. Consider aliasing to 
have a negligible effect, then 

p F (W) _ {P{w{u)}(W)-kP F {W)}{W) 

Pb(-W) ~{P{w{u)}(W)*P b (-W)} (W) 

_{F{w(u)}{W)*P b {-W)R{W)}{W) 
{P{w(u)}(W)*P B (-W)}(W) 
_{F{w(u)}{W)*P b (-W)}{W)R{W) 
{P{w{u)}(W) * P B (-W)} (W) 
=R(W). 

In going from the second to the third line we have only 
used that R(W) = e P( w ~ AF ) does not vary too much 
on the scale of the characteristic width AW Fw ~ 7t/T 
of the smoothing kernel J 7 {w(m)}(W / ), i.e., /3AW Fw 1 
or T > 7r/3. However, even this criterion is sometimes 
overly strict and may be loosened depending on the re¬ 
lationship between ctq, /? and A W Fw . For example, con¬ 
sider the case that both pq{W) are effectively flat on 
scale A W Fw be. ctq/ AW Fw ^ 1 assuming pq{W) to be 
unimodal. Then even having j3AW Fw ~ 1 would lead 
only to small errors, as errors due to variations in R(W) 
within the convolution would largely cancel. Note that if 
an incorrect choice is made and spectral leakage does in¬ 
troduce errors, then this will be clear from the non-linear 
behavior of L(W) = ln(i?(IT)) and thus a larger T can 
be chosen. 

In this work, we find that a good rule of thumb is to 
choose T to be sufficiently long that the qubit has fully 
dephased. Specifically, for the superfluid calculations, 
we use a phenomenological and unoptimized expression 
based on the above discussion 

T = 7T0[l + (5-l)e-^], (6) 

with a — a F = ct b . The effect is that spectral leakage, 
like aliasing, has a negligible effect on errors. In a real 
experiment, (3 is not known in advance and so T must 
be chosen using the considerations above and some prior 
expectations about /?. We find that the thermometry 
protocol is typically robust to changes in T by factors of 
the order of unity. 


B. Random errors 


From now on, we assume that T and N steps have been 
chosen, such that R(W) = p f (W)/pb(—W). We focus 
on the fact that XQ( u j) will not be known exactly and 


that instead we only have access to an estimator Xq(u) 
based on expectation values estimated from a finite num¬ 
ber N meas of measurements each. We will always use a 
bar to indicate an estimate that is a random variable 
obtained stochastically from measurements made during 
the protocol. Propagating this forward according to 

( w Btep s 

1 + 25? | £ e-^XQ^M^) 

(7) 



P Q (W) = + ( 


we obtain an estimate Pq(W ) of pq(W). The remain¬ 
der of this subsection addresses how to choose a set of 
work values M4 such that pQiWk) are independent, un¬ 
biased P[pQ{Wk)} = PQiWk), and of known variance 
Var[pQ(ITfc)]. Here expectation values are always taken 
with respect to the distributions generating the measure¬ 
ment outcomes. 

Let us begin by considering how xq( u j) are estimated. 
In an experiment we estimate 


xq{u) 


gi Hu) 
2s|s4. 


«+r) + i(+y» , 


with 4>{u) = A (t + u) by first estimating expectation val¬ 
ues (ct m ), for p = x, y, with respect to the state p q of the 
qubit at the end of the interferometric protocol. Specif¬ 
ically, each (ct m ) is estimated from N meas independent 
measurements of ct m , which returns 1 or —1 with prob¬ 
ability p = (1 + (ct p ))/2 and 1 — p, respectively. Then 
the average ct m of the measurements is an estimator of 
(ct m ) that is unbiased, i.e., its mean is E[ct p ] = (ct m ), and 
has variance Var^] = (1 — (CT M ) 2 )/A meas . In accordance 
with the central limit theorem, for large enough N meas 
the estimator is normal and so its properties are fully 
characterized by its mean and variance. 

The linear combination 

e i 0O) 



is thus also an unbiased (E[xq(w)] = Xq ( u )) Gaussian 
estimator of xq ( u ) an< l h as variance (using the general¬ 
ized definition Var[+] = E [\z — E[+]| 2 ] of variance for a 
complex random variable z) 


Var[xQ(u)] 


VarfcrJ + Var[(T y ] 

4 l«t s bl 2 

2-4|s^| 2 |xQ(n)| 2 

4|s^S4.| 2 A meas 


( 8 ) 


where the first line holds due to the independence of a x 
and ct y . 

Note that this variance Var[xQ('u)] = Var[5?{xQ(u)}]-|- 
Var[S{x'Q(w)}] is divided between those of the real and 
complex parts, but how exactly this division occurs de- 
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pends on the phase 4>{u) = A(r + it), and thus A, and 
the phase of stsj.. The effect of A is quantitative but not 
qualitative, and the A for a particular implementation 
can be known. For the purposes of our plots, we assume 
4>(u) = A = 0. Further, Eq. ^ makes it obvious that 
sT and si should be chosen such that |sSsj.| 2 takes its 


maximum possible value, 1/4, and in all plots we assume 
the choice sJs^. = 1/2 is made. 

This variance propagates forward into our estimator 
Pq{W) of pq(W) according to Eq. (|7j). Since this again 
a linear sum, the estimators are unbiased and Gaussian, 
characterized by the expectations E \Pq(W)\ = pq{W) 
and covariances 


E[ (p Q {W) - p Q {W)) C Pq(W') - PQ {W'))} 


4|s|s4.| 2 A meas \2tt J 


A u\ 


) H ^)! 2 cos - W ') u j) ( 2 - 4 l s t s ll 2 |^oK ')| 2 


(9) 


4 (sls±) 2 N n 


Au V 
2?r ) 


^w 2 (u J )3?{e- i(w+w ' ) ^e 2i ^ ) sR{e- 2i ^ ) 4(s;sq) 2 XQ(%)}}. 


The first term arises from the interference of random 
deviations in xq{ u j) & f different Uj. The second term, 
which is much smaller, arises due to the fact that, to en¬ 
sure that pq(W) is real, we have used XQ(~ u j) = Xq( u j) 
rather than generate independent estimators for XQi u j) 
and Xq(-Uj). 

When W and W' are close, approximating W = W', 
we obtain the variance. The dominant term is given by 


Var [pq(W)\ 


4|4nl' 


Nrr 


Au 


au \ \ - 

2w I ^ 


W{Uj) 


x 2-4|s^s 4 .| 2 |x 0 (u i )l 


( 10 ) 


Notably, this scales as Var \pq{W)] ~ T 2 /N steps N meas . 
Compare this to the conditions T/rgdeph 2irN steps 
and T > tt/3 for avoiding systematic errors due to aliasing 
and spectral leakage. The form of Var [pq{W)\ suggests 
increasing N s tep s, which would also reduce the aliasing 
error. However, Var \pq{W)\ suggests taking a smaller 
window size T, which would come at the cost of higher 
spectral leakage. We leave a discussion of the trade-off 
of these two errors until later. 

When W and W' are separated by more than roughly 
7 t/T, the cosine term in Eq. © will oscillate rapidly 
enough that the covariance is significantly reduced. Thus 
7 t/T represents the range in W over which our estimates 
Pq(W) are correlated. Thus there is little information 
added by generating estimates Pq(W) for W separated 
by less than this correlation range n/T. As a result, 
in our inference protocol we consider only the values 
pQ{Wk) taken at an array of points Wk = kn/T sep¬ 
arated by this amount. These values should only be 
weakly-correlated. Conveniently, but not crucially, these 
are exactly the same values Wk for which pQ{Wk) can be 
found using the fast Fourier transform. This discussion 
suggests increasing T in order to increase the density of 


points Wk- Again, a discussion of the trade-off of this 
with the other factors affecting the choice of T is left 
until later. 


VII. INFERRING THE TEMPERATURE 


Here we detail the core of the thermometry protocol, 
giving the precise procedure to go from the estimates of 
the work distributions pg(Wfc) discussed in the previous 
section to an estimate of the inverse temperature /3. Note 
that the protocol uses no knowledge specific about the 
distributions Pq {W) and characteristic functions xq { u ) 
other than the width cfq and dephasing time rgdeph de¬ 
scribed above. 


A. Outline 

The essential fact on which we base our inference is 
that 

L(W) = In (R(W)) = In j j = 0(W - AF), 

* 11 ) 

where we refer to the work distributions pq(W), ratio 
R(W) and log-ratio L(W ) discussed in the previous sec¬ 
tion. 

The information we collect during our protocol is a set 
of near-independent estimates pQ(Wk) of pq{Wu) and 
their variances at a discrete set of energies Wk- The 
essential idea is to use these estimated values of the work 
distribution, together with our knowledge of how they 
relate to /3 and A F via Eq. ( |TT| ) , to infer the values, or 
distribution of values, of /3 (and A F if desired). 

We consider two approaches to this inference problem, 
frequentist and Bayesian. The Bayesian approach is used 
for all results we present in the main text, but methods 
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along the lines of the frequentist approach are more com¬ 
monly found in the literature. Our presentation of the 
frequentist approach thus serves to highlight the bene¬ 
fits our Bayesian approach, and also provides a simpler 
setting in which to analyze the dependence of errors on 
parameters e.g. T used in the protocol. 


1. Frequentist 

The frequentist approach is most similar to how A F 
has previously been estimated from work distributions. 
The approach is to use jjq {Wk) to obtain estimates 
L(Wk) of L(Wk) for many values of work Wk- For now, 
let’s assume, these estimates L(Wk) are independent, un¬ 
biased E [L(Wk)\ = L(Wk ), and Gaussian with known 
variance Var \L(Wk)\- Assuming this, the knowledge of 
Eq. @ means that we can use weighted linear regres¬ 
sion to construct the maximally likely f3 and A F as the 
pair of values that minimize the least-squared error 

y (L(W k ) - /3(W k - AF)) 2 
V Var[L(Wfc)] 5 1 J 

and take them as our estimates of /3 and A F. This is 
shown in Fig. [3] The procedure then boils down to ob¬ 
taining estimates L{Wk) whilst ensuring no bias and es¬ 
timating Var[L(Wfc)]. 

As a starting point, consider the estimator L'(W) = 
In {Pf(W)/Pb(—W)}. Due to the non-linear nature of 
the inverse and logarithm functions, L'(W) is neither 
Gaussian nor an unbiased estimator of L(W). The 
errors resulting from the bias can be small, but ide¬ 
ally we would like to construct an unbiased estima¬ 
tor L{W) = L'(W ) — AL(W), which removes the bias 
A L(W) = E [L'(W)\ — L(W). We would also like to 
characterize the remaining mean-zero random errors in 
L(W) and find when they are approximately Gaussian. 

To do this, we approximate the bias and variance by 
simulating the sampling of Pq(W), by adding Gaussian 
noise of zero mean and variance Var [pq{W)\ to Pq(W) 
before taking the logarithm. From these values, we then 
estimate the bias A L{W) and variance Var[L(W)], and 
assess the non-Gaussian character of the distribution of 
L(W). As is expected, we find L(W) is reasonably Gaus¬ 
sian when Var \Pq(W)\/Pq(W) is small. For larger values 
the distribution is very skewed and least-squares min¬ 
imization may not correspond to the maximally likely 
parameters. 

A further flaw in this approach is the possibility of 
obtaining negative values Pq{W), whose logarithm is 
undefined. We simply ignore values of Wk for which 
pF^Wk) or Pb{~ JVfc) is negative, and we ignore nega¬ 
tive values arising in our estimate of the bias and vari¬ 
ance. This comes at a cost of potentially inducing a bias. 
Balancing the need to avoid such a bias with the de¬ 
sire to have as many points as possible for the fit, in 
our frequentist calculations presented here we only con¬ 


sider values Wk where N&v\p F (Wk)\/p 2 F {Wk) > 1 and 
Var[ps(— Wk)]/p 2 B {—Wk) > 1. The results in Fig. [4 show 
that this bias is perhaps acceptable but not small. A 
better approach for dealing with negative values is the 
Bayesian approach of the next subsection, which makes 
full use of the information provided by obtaining a nega¬ 
tive value for pF{Wk ) or p n{—Wk) - For now, we continue 
our analysis of the frequentist approach ignoring any bias 
introduced by these negative values. 

We have shown how to choose a set of Wk for which we 
have approximately independent, unbiased, and Gaus¬ 
sian random estimators L{Wk) of L{Wk) whose variances 
Var[L(Wfc)] we know approximately. From these, we then 
obtain a maximally likely (3 and A F as the pair of values 
that minimize the least-squared error of Eq. (12) and take 
them as our estimates of 8 and A F. Let us now turn to 
discussing how the variance in /? should behave, including 
how it depends on some of the parameter choices we make 
in our protocol, particularly T, since the dependence of 
the error on fV meas and N steps is clear. 

It is well known that, for least-squares estimation, the 
fitted parameter /3 provides an unbiased estimate of [3 
with variance 


Var[/3] = 


ftv 2 / ft 


Vft 

- (ftv/ft) 2 ’ 


* ^ Var [L(W k )V 

_ ST' Wk 

^ V v “ \L(Wk)Y 

y- Wl 

^ A. Var [L(W k )Y 


We obtain a simpler expression for the fractional error 

x/Var PI y/Var [L]/N w 

P pA W k ’ 

that qualitatively captures the basic behavior if we as¬ 
sume uniform variance Var \L(Wk)\ ~ Var[Z] with Var[L] 
the typical value of the variance over the values of Wk 
used. Appearing in th e denominator of this equation is 
the spread AW fe = y/J2k W%/N w - (J2 k W k /N w ) 2 of 
the values Wk, where we have written Nw = J2k f° r th 6 
number of points used in the estimation. 

We can now insert some of the findings from the er¬ 
ror analysis of the previous sections into this expres¬ 
sion. First, demanding independence of points required 
us to choose that Wk were spaced by 7r/T and so Nw ~ 
AWkTj7 r, giving us 

x/VaZpI ^Var [L\/T 

P ~/?(A W k ) 3 / 2 ' 

Second, the range of work values A Wk we in¬ 
clude is the range satisfying Na,r[p F {Wk)\/p 2 F {Wk) > 
1 and Var[ps(— Wk)}/p 2 B {—Wk) > 1, which corre- 
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FIG. 3: Superfluid phase; frequentist. (a) The logarithms 
\yi{pf(W)} and ln{p_B(— W)} of the work distributions and 
their difference L(W) = ln{pi?(W)} — \n{p B (—W)}. Solid 
lines are known values, points are estimates obtained in a 
single simulated experiment, and error bars calculated from 
those estimates mark expected errors of one standard devia¬ 
tion. The black dashed line is obtained from a weighted least 
squares fit. (b) A histogram of f3 estimates obtained in 1000 
simulated experimental runs, together with their mean pp and 
standard deviation 073 , and Gaussian fit. All parameters are 
identical to those for Fig. 1 in the main text. 


sponds to ensuring that T 2 /N meas N steps p 2 F (W k ) and 
T 2 /N megbS N steps p F (—W k ) are smaller than a threshold 
amount. This tells us that this range A W k can decay 
quickly with increasing T if Pq(W) depends sharply on 
W at the edge of the region W k ■ In turn this means that 
T should be chosen to be as small as possible without in¬ 
troducing spectral leakage, which is the solution to one of 
the main questions remaining from the above discussion. 
We found these optimum values of T to behave roughly 
as Eq. §, essentially linearly in (3, leaving us with 

\/VarpJ \/Var [L] 

P ~ (flAW k ) 3 / 2 ' 

It is difficult to ascertain from this simplified analysis 
how this fractional error depends on (3. In our examples 
we find that the relevant width AW k of work values de¬ 
creases roughly linearly with j3 for large j3. The variances 
captured by Var[F] are the deciding factor. We find that 
in our examples they increase with f3 2 as might be pre¬ 
dicted from Vsl,y[pq(W)\ ~ T 2 and our setting of T ~ f3. 
Thus the fractional variance of our f3 estimate increases 
roughly linearly with (3 [Fig. [ 4 ]. 


2. Bayesian 

The Bayesian approach is to capture probabilistically 
the knowledge we have about [3 and AF conditioned 
upon obtaining estimates Pq{W). It also allows us to 
include our prior expectations about the system into 
the statistical analysis. However, we do not make use 
of that feature here, assuming nothing about the sys¬ 
tem. Further, the Bayesian approach outputs a proba¬ 
bility distribution of the values of (3 and A F, which need 
not be Gaussian, rather than only returning maximally 


FIG. 4: Accuracy at low temperatures; frequentist. Over 1000 
simulated experiments for each j3, the average fractional stan¬ 
dard deviation ap/ [3 and bias {pp — (3)/f3. All other parame¬ 
ters are identical to those for Fig. 1(d) in the main text. 


likely parameters and an estimate of their variance. The 
Bayesian approach therefore can use and provide more 
information than the frequentist approach. 

In the following we use the shorthand notation pf = 
p F {W k ), p B = p B {~W k ), R = R(W k ) and W = W k . We 
write O k to denote the observations of estimates p F and 
Pb for some value of k and O for the combined set of 
observations. 

The Bayesian approach is based on the expression 


V(J3,AF\0) 


V{Q\f3,AF)V(/3,AF) 
f d/3dAFV{0\{3, AF)V(/3, AF) ’ 1 ’ 


for our assessment of the probability of the system having 
temperature [3 and free energy difference AF given obser¬ 
vations O. It uses our assumptions about V{0\(3, AF), 
the probability we would have obtained those observa¬ 
tions for all possible values of /? and AF, and the prior 
V(/3, AF), capturing our knowledge of the system in ad¬ 
vance of the experiment. 

In this paper we make what is called a null prior, set¬ 
ting V(/3,AF) to be constant, essentially assuming we 
have no information about (3 and AF. An experimen¬ 
talist who does have more prior information could easily 
adapt our approach to include that information in the 
inference scheme. 

We are left then to come up with an expression for 
the conditional probability V{0\f3, AF), which, assum¬ 
ing independence, is just a product of the conditional 
probability of obtaining pairs of observations at each W. 
The next few paragraphs deal with the evaluation of this 
conditional probability. We perform this evaluation by 
breaking it up into several parts, in three stages. First 
we use that 


np F ,p B \/3,AF) = J dRV{p F ,p B \R)V{R\fl,AF) 

= V{P f ,Pb\^ w ~ AF) ). 

Here we have conditioned the obtaining of the estimates 
p F and pb on the ratio R of their true values, and used 
fact that this ratio is always equal to e^ 44 ~ AF \ It is this 
piece of information that is the key to the whole protocol. 
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The second step is to use 

'P(Pf,Pb\R) = J ^PF^Pb'P{pf,Pb\PF,Pb)'P{pf- i Pb\R)- 

Here we have conditioned on the exact values of the work 
distributions pq. We do this because we know that Pq 
are unbiased and Gaussian with known variance ctq i.e. 
V(Pf,Pb\pf,Pb) = V{Pf\pf)V{Pb\pb) with 


This leaves us having to define some prior V(pf,Pb) rep¬ 
resenting our prior knowledge of the work distribution 
values. We again essentially assume no prior knowledge, 
assuming independence V(pf-,Pb ) = V{pf)'P(pb ) and a 
uniform distribution over positive values 

h if 0 < PQ < c 

0 otherwise 



R(pq\pq) = 


3 -(pq-pq) 2 /2o-? 


2it<j 2 q 


for some C taken to be much bigger than all other values. 


The distribution V(pf-,Pb\R) of the work distributions 
given their ratio can be built in our third and final step. 
We use Bayes’ law again 

v( | m 'P(R\pf,Pb)V(pf,Pb) 

F ' B $ &Pf&Pb'P(R\Pf,Pb)'P{Vf,Pb) ’ 

since we know that 

V{R\pf,Pb) = S(R- pf/pb) = PbS(Rpb ~Pf)- 


We now have all we need. Collecting all of the above 
together, after performing two integrals, we obtain 

V(pf,Pb\R) = 

2 max{l, 1 + R 2 }p B 5(Rp B ~ Pf)V{pf)V{pb), 

and 


R{Pf,Pb\P, AF) = 
max{l, 1 + R 2 } 
2nC 2 R 


exp 


p9_ 

2 o' 2 


Pb_ 

2a|j 


O pO B 
™FB 


exp 


(p' F — Pb) 2 

2 a FB 


pWr 


- PbO’f 


\f2ito' 
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FB 


1 + erf 


( p'f°b +PBCrp \ 

y \/2 ^ y o' f>o y j, J 


where we have used the shorthand p' F = pf/R, er ' F = 
of/R and a FB = a' F + a 2 B - 

All of this can be fed back into our expression for 
V(/3, AF\0) [Eq. (13)] and allows us to plot this dis¬ 
tribution and calculate its properties. Upon testing, the 
Bayesian prediction is found to be remarkably well cali¬ 
brated. We have performed 1000 simulated experiments 
and the fraction of times in which the true /? value lies in 
each decile of the Bayesian prediction. Ideally this would 
be exactly 0.1 for each decile and our predictions conform 
to this, staying between 0.8 and 1.2 for /3 between 1 and 
50, as shown in Fig. [5] 


VIII. THEORETICAL VALUES FOR X q(u) 



Decile 


FIG. 5: Accuracy and precision; Bayesian. The fraction, over 
1000 simulated experiments, of experiments in which the true 
value of /3 lies in the deciles of V(/3) predicted using Bayesian 
analysis. The flatness of the plots, especially for the lower 
temperatures, reveals that our null priors are well calibrated. 
All parameters are identical to those for Fig. |4] 


In this part of the supplemental material, we provide 
details of the two methods used to calculate the charac¬ 
teristic function (as well as other relevant quantities) of 
our example system. Specifically, we calculate 

XQ(u)=tr{u^(X Q (T),u)U Q U(X Q (0),u)pp(XQ(0))}, 

(14) 
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where 


Uq =T exp 


-i fdtH(X Q (t)) , 
J o 


U{X,u) 

frW =e~ pAw /Zp{ A), 
Zfs{\) =tr{exp [-0H (A)]}, 
H{ A) =H S + XV. 


(15) 


We begin by discussing the Bogoliubov treatment, rele¬ 
vant to the superfluid phase, and then move on to the 
tensor network approach that is necessary when interac¬ 
tions are stronger. 


A. Bogoliubov treatment 

1. The free energy 

In the Bogoliubov treatment [35] we have the simplified 
Hamtilonian H{X) = Hs + AH, where 


phonons present in the initial state p/s(Xq(0)) into the 
Hamiltonian H{ A). We do this using a method similar 
to that used for the free energy. We define a new dis¬ 
placed creation operator d) k = b\ + Aq(0 )77 / c /w/ c and its 
annihilating conjugate dk■ In terms of these operators, 
we re-write the Hamiltonian as 


H( X) =H'(X') = H' s + X'V', 
X' =X — Aq(0), 

H' s = ^2 Ukdldk, 


V' =i)n! + ^(^4 + h - c -), 

k 


n' =n — 2 Aq(0 ) ^ 
k 


M 2 

VU k 


Here n! represents the reduced background density due 
to the initial perturbation, and we have omitted a con¬ 
stant term XQ(0)r](n — Aq(0) Y2k Idkl 2 / pojk) representing 
its energy. 

We now have 


H s = uikbj.bk, 
k 

V =?7 n + J2^k b l + Vkh)- 

k 

Here b\ and b k , introduced in the main text along with 
the other terms, satisfy bosonic commutation relations. 

This Hamiltonian can be diagonalized by defining a 
new displaced creation operator d) k = b k + Xrjk/oJk, and 
its annihilating conjugate dk, leading to 

H(X) = ^ u> k d\d k + A ?7 | n - X ^ ) • 

k \ k 1 7° Jk ) 


Xq(u) = tr (Xq(t), u)UQP'pity} > 

where 

Uq =T exp[—i [dtH'(X' Q m, 

Jo 

U\X',u) =e~ ifl 'U')u , 

Pp( X') =e-P*'U')/Z' f} ( A'), 

Z’p(X') =tr{exp[-£ff'(A')]}. 

Note that, by design, Aq( 0) = 0 and thus we have used 
U'(Xq(0),u) = U'(0,u) = 1. In what follows, for clarity, 
we drop the primes. 


We can thus immediately write the free energy as 


F{\) = ~]nZ p (\) 

= Xr)ln-X^2 

\ k 

F( 0) = — ^ lntr | exp 


\Vk\ 2 

V Uk 


F( 0), 


-0^2 Ukd\d k 




,-pUk 


)• 


2. The characteristic function 


Our first step in evaluating \q (it) [Eq. (14)] is to sim¬ 
plify by absorbing any displacement of the Bogoliubov 


Our second step is to move to the interaction picture, 
whereupon 

XQ H = tr { U ] Q (0)^ (Aq(t), u, t)U q { u)pp (0) j . 

Here the tilde indicates an operator in the interaction 
picture, specifically 


UQ(t) =T exp 
U(X,u,t) =T exp 


-i dt'X Q (t')V(t'+ t) 
Jo 


-iX dt'V(t' + t) 


V(t) = e l Fsty e -iHst 

=v n ++ h - c -)- 

k 

The third step is to simplify the time-ordered expo- 
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nentials. For this we appeal to the Magnus expansion 


U =T exp 


-i dt'X(t')V(t') 


= e 


-i A 


in terms of Hermitian operators 


A =A [1] + A [2] -)-, 

= (dt'X{t')V{t'), 

Jo 

a [2] =i± fda fdt"x(t')x(t")[v{t'),v{t'% 

1 Jo Jo 


The simplifying feature of the Bogoliubov Hamiltonian, 
which we will use repeatedly, is that the commutator 

[V(t),V(t')] = -2iJ^ M 2 sin (u> k (t - 1?)), 

k 


is a pure imaginary c-number. Hence we have omitted 
the tilde from A^ to highlight that it is a real c-number 
only. This also ensures that all terms A ^ for m > 2 
vanish from the Magnus expansion. The result is that 

XQ(u) =e iAl2] ^ 

x tr{e i ^ 1 ( 0 ) e il[ 11 ( A oW’“’ r) e“ a « ( “ ) ^( 0 )} . 

Here, we have used a simplification of the form 

e lA Q l°) e _lA Q (“) = i ) and the other integrals appearing 
in the expression are as follows 


= ldt'X Q (t')V(t' + t) 
Jo 

=r)n fdt'XQ(t') 

Jo 

+ £ ( 7T A Qfc^ 


—A* nk d\,e iuikt + h.c. 

Wfc 


A Q k /dfAq(f)e 

Jo 

f u 

A^(\,u,t) =A dt'V(t' + t) 
Jo 


—iuikt 


=r]nXu 


+ £ A A* fc (A,u)4e i ^ t +h.c. , 
, \ w fc 


A fc (A ,u) =ui k X dte 


—ic Jkt 


= - iA(l - e~ 1UkU 
2 

Vk ' 


Hl 2 l(A,u) = -A 2 ^ 


Wfc 


(wfcU - sin(wfcu)). 


Having simplified each time-ordered exponential, our 
third step is to combine them using the Baker-Campbell- 
Hausdorff formula 

e A e B = e A+B+[A,B]/2^ 


for the case that [A, B] is a c-number. Explicitly, we use 


e'A ^ 1 ( 0 ) e L 4 [1] (A Q (r),M,r) e -iAW (u) 


— («))+iA [11 (A q(t),u,t) 

x exp { - ^([Aq(w), Aq(0)] 

+ [Aq ] {u),A [ 1 ] (Xq(t),u,t)} 

+ [ig 1 ( 0 ),AW(A Q (r),u,T)])}, 


together with 


[A [ $(t),A%(t')} 


l 1 ! C+'A 


= -*£ 


Vk_ 

Wfc 


|AQ fc | 2 sin(w fc (t - t')), 


= - 2i £ ^ 9 { A Q* Afc(A, u)e^*-^ } . 


This leaves us with 
XqH = exp (upiX Q (t )u) 

x exp ( —iXq(t) 


where 


V k_ 
Wfc 


(wfcU - sin(w fc u))J 


ex P ( -i£ 


Vk_ 

OJk 


hQk(u ) 


x trjexp ^-i£ [^ 9 QkHd{ + h.c. j J ^(O) j , 

(16) 


h Qk {u) = - |A Qfc | 2 sin(w fc 'u) 

- ${a^ (e^“ + 1) Afc(Ag(r),lOe"^} 

=HQ k sin(uj k u), 

H Qk = - |A Qfc | 2 - 2A Q (r)s{A^e- i “^}, 

g Qk {u) = - A Qk (1 - e~ iUkU ) - Afc(Aq(r), u)e~ iWkT 
=G Qk (1 - e-^“) , 

GQ k = - A Q k + iXQ(r)e~ lulkT . 

The final step is to evaluate the trace tr{-p^(0)}, or ex¬ 
pected value (•) with respect to state pp(0). We use that 
the exponent of the first term in the trace contains only 
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terms that are linear in d\, and d k ■ The state pp( 0) 
with respect to which the expected value is taken, the 
other term in the trace, consists of a mixture of differ¬ 
ent occupations of these phonon modes, with all non¬ 
number conserving combinations of d\ and dk thus hav¬ 
ing zero expected value and ((d\.dk) m ) = {n k ) m with 
rik = (exp(ySwfc) — 1) _1 the mean occupation. This means 
that for an operator formed for linear combinations of dj, 
and dk, we have 



=exp (-5 ((e + hc )) }) ’ 


and the particular expectation value is 


-7 L <?Qfe( u )4 + h - c- 




=£ 


iik_ 

Wfe 


\9Qk(u)\ coth 


(%*)• 


where we note that 


\9Qk{u)\ 2 =4 \GQk\ 2 sin 2 . 


Inserting this back into Eq. (16) we arrive at our final 
expression 


*l n Xo(w) = -VnXQ(T) u + ^2 — (Xg^UkU + (H Qk - \q{t)) sin(w fc w)) - 2i^ 

iOk J 


*7fc 

u k 


\G Qk \ 


sin 


f WfeU 1 

^ 2 J 


tanh 


'fak \' 


3. The work distribution 

As discussed above, the cumulants K m of the work dis¬ 
tribution Pq(W) are related to the derivatives of the log¬ 
arithm of its characteristic function, by Eq. |5). We may 
use this to calculate all cumulants of the work distribu¬ 
tion, but here we report only the first three 

MQ = kqi =Xq(t)tiu - V n Qk , 
k “ k 

a Q = KQ 2 = £ |*7fc| 2 |G Qfc | 2 coth 

k 

IVkl 2 u k H Qk . 

k 



4- Symmetry 

A close inspection of the results reveals that reversing 
the quench has no effect on either HQ k or \GQ k \ 2 and 
thus also on the cumulants KQ m for m > 1. This rep¬ 
resents the fact that in the Bogoliubov description the 
deviations of the condensate, assumed small, do not to 
interact resulting in a symmetry between repulsive and 
attractive interactions. This means that the only differ¬ 
ence between the two characteristic functions lies in the 
phase i)nXQ(r)u. The effect of this is merely to shift 
the associated probability distributions by the amount 
* 7 nAg(r), as we can see in our expression for fiQ. Eval¬ 
uating these phase shifts, we find that, for the Bogoli¬ 


ubov case, the forward and backward distributions are 
identical up to a relative shift Pf — pb = 2 A F, where 
A F = F(Xf) — F(Xi) is the free energy difference. 

This symmetry could be exploited to provide a better 
estimate for [3. However, we do not do this here as we 
wish to keep our estimation protocol general, so that it 
is applicable to situations where this symmetry does not 
exist or is not known to exist. 


B. Tensor network theory 


In regimes where the condition necessary for the va¬ 
lidity of Bogoliubov theory Un C J is not satisfied, 
we numerically calculate the characteristic function us¬ 
ing a tensor network representation. We proceed by 
writing the quantum state as a matrix product opera¬ 
tor (MPO) [HSt [40| and then performing imaginary- and 
real-time evolution using the time-evolving block deci¬ 
mation (TEBD) algorithm |.'I61 [571 . In the following, we 
concisely outline our method. See Ref. [41] for a detailed 
pedagogical introduction to tensor-network algorithms. 

The state of a quantum lattice system with M sites 
and local Hilbert space dimension d can be represented 
in MPO form 


d d 

p= £ tr 


A^ 1 ’ ■■■ A 


M 


(17) 


x |*i ■ ■ -i M )(ji ■ ■ ■ Jm |• 


Here, the matrix elements of the density operator are 
















16 



FIG. 6: (a) Schematic depicting one part of a tensor network 
representing the time evolution of an MPO (red circles) gen¬ 
erated by a product of two-site gates Ui,i +1 and Vi.i+i acting 
on the MPO from the left and right, respectively, (b) Time 
evolution proceeds by contraction of nearest-neighbor pairs 
of MPO tensors A 1 A 1+1 with two-site gates above and below, 
resulting in a new tensor This tensor is then decom¬ 

posed into a product of tensors A[ A' l+1 via an SVD. Retaining 
only the largest D singular values in the SVD maintains an 
efficient MPO representation at each time step. 


given by the trace over a product of matrices A < ' l " : ' l> of 
maximum dimension D x D, where the bond dimension 
D quantifies the correlations between lattice sites. The 
states {|b)} constitute a complete orthonormal basis for 
the Hilbert space on lattice site l. The set of matrices 
A^ 1 ^ may also be considered as elements of a single com¬ 
bined tensor Ai of dimension D x dx D x d. Together the 
tensors Ai can represent an arbitrary quantum state, for 
sufficiently large d and D. In a bosonic system, however, 
both the physical dimension d and the bond dimension 
D must be truncated, in general, in order to obtain a 
tractable numerical representation. In our calculations 
we use d = 4 and D = 200. 

The time-evolution operator over a small time step St 
is approximated by a product of two-site gates using a 
second-order Suzuki-Trotter “staircase” decomposition, 
see Ref. [38] for details. The tensor network represent¬ 
ing the time evolution of the quantum state is depicted 
schematically in Fig. |6](a). The TEBD algorithm pro¬ 
ceeds by sweeping across this tensor network and ap¬ 
plying two-site gates sequentially to pairs of nearest- 


neighbor tensors AiAi + i appearing in the MPO repre¬ 
sentation of the quantum state in Eq. ©■ The result is 
a new tensor 0/ ; +1 formed by contracting the A tensors 
with two-site gates above and below. A singular value 
decomposition (SVD) is then performed on O/^+i, re¬ 
sulting in a new pair of tensors A[A[ +1 [Fig. |6Vb)] . Only 
the largest D singular values at most are retained after 
the SVD, so that an efficient MPO representation of the 
quantum state is maintained at each time step. Repeat¬ 
ing this procedure across the entire system over N small 
time steps St leads to the desired numerical evolution 
over a time duration t = NSt. 

In order to find the initial thermal state, we use the 
identity 


The right-hand side of this equation can be calculated us¬ 
ing the TEBD algorithm as outlined above, after a Wick 
rotation to imaginary time t —¥ —i/3/2 and taking the 
initial state to be the system-wide identity operator 1. 
The MPO representation of the identity operator is given 
simply by A^ = S iljr 

The characteristic function [Eq. fl4| )] may then be cal¬ 
culated by performing real-time evolution to find the op¬ 
erator 

X Q {u) = U Q e- iu ^ XQ ^p 0 (X Q (O))Ule iu6 ^ Q ^\ 


such that Xq( u ) = tr j Xq(u) |. The unitary Uq = 

Texp[—i fj dtH(\Q(t))] [Eq. (15)] is performed by dis¬ 
cretizing the quench path intosteps A Q m = Aq (rnSt). 
with to = 1,..., M and r = MSt. Discrete time evolu¬ 
tion under the TEBD algorithm reproduces the quench 
unitary, since 


Uq~ J] 

m=M 

for sufficiently small St. 

In order to ensure numerical stability, each matrix 
Aj 11 ^ is divided by its matrix norm (the square root 
of the sum of its elements) after each time step. The 
accumulated product of these norms then multiplies ex¬ 
pectation values to give the correct final result. 


















